NMS-19648: CircleCI: Store failed result of flaky tests, improve a test case by mershad-manesh · Pull Request #8404 · OpenNMS/opennms

mershad-manesh · 2026-03-30T13:15:52Z

All Contributors

Have you read our Contribution Guidelines?
Have you (electronically) signed the OpenNMS Contributor Agreement?

External References

Jira (Issue Tracker): https://opennms.atlassian.net/browse/NMS-19648

Copilot

Pull request overview

This PR focuses on improving CI reliability and debuggability around flaky tests by preserving failure evidence across retries, refining CircleCI smoke/itest distribution, and stabilizing a few UI component tests.

Changes:

Update CircleCI smoke/itest scripts to preserve failing JUnit XMLs as “flaky evidence” before retrying, and improve test splitting/logging.
Introduce a quarantined smoke-test-flaky CircleCI job/suite and adjust existing smoke job parallelism/config behavior.
Stabilize several Vue/Vitest UI tests (dialog stubbing, cleanup improvements, and debounce handling) and harden Selenium clicking via retry/obscured-element checks.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.

File	Description
ui/tests/components/EventConfigurationDetail/EventConfigEventTable.test.ts	Adds dialog mocking/cleanup and switches debounce verification to `flush()` approach.
ui/tests/components/EventConfiguration/EventConfigSourceTable.test.ts	Adds stubs for dialogs, improves teardown, and updates debounce tests to use `flush()`.
ui/tests/components/EventConfiguration/Dialog/UploadedFileRenameDialog.test.ts	Mocks FeatherDialog and improves teardown by closing dialog before unmount.
ui/tests/components/EventConfiguration/Dialog/CreateEventConfigurationDialog.test.ts	Mocks FeatherDialog and adjusts teardown/visibility handling to reduce flakiness.
ui/tests/components/EventConfigEventCreate/BasicInformation.test.ts	Ensures fake timers are always restored via `try/finally` and adds afterEach timer reset.
smoke-test/src/test/java/org/opennms/smoketest/GrafanaEndpointPageIT.java	Uses `clickElement(By...)` helper to reduce click flakiness.
smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Toggle.java	Switches to retrying click helper for toggles.
smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Element.java	Adds `clickWithRetry` with scroll + obscured-element detection.
smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Button.java	Switches to retrying click helper for buttons.
smoke-test/src/main/java/org/opennms/smoketest/selenium/AbstractOpenNMSSeleniumHelper.java	Enhances `clickElement` with scroll + obscured-element detection.
.circleci/scripts/smoke.sh	Filters ITs by suite category, simplifies docker pre-pull, and preserves failing XMLs before retries.
.circleci/scripts/itest.sh	Adds node summary logging, preserves failing XMLs before retries, and increases db create threads.
.circleci/scripts/find-tests/git.py	Fails fast with a clearer error when `.nightly` lacks `parent_branch`.
.circleci/scripts/find-tests/find-tests.py	Fixes shebang and removes noisy debug printing.
.circleci/pyscripts/generate_main.py	Reduces workflow `max_auto_reruns`.
.circleci/main/workflows/workflows_v2.json	Adds workflow entry for quarantined flaky smoke tests.
.circleci/main/jobs/tests/smoke/smoke-test.index	Registers the new `smoke-test-flaky` job.
.circleci/main/jobs/tests/smoke/smoke-test-sentinel.yml	Increases sentinel smoke parallelism.
.circleci/main/jobs/tests/smoke/smoke-test-minion.yml	Increases minion smoke parallelism.
.circleci/main/jobs/tests/smoke/smoke-test-flaky.yml	Adds a quarantined flaky smoke job (allow failures, retries).
.circleci/main/jobs/tarball-assembly-only.yml	Adjusts resource defaults and Maven heap settings.
.circleci/main/commands/oci/trivy-analyze.yml	Removes a redundant/incorrect step from the Trivy analyze command.
.circleci/main/commands/generic/generic.yml	Avoids repeated apt installs by checking package presence first.
.circleci/main/commands/executions/run-smoke-tests.yml	Adds allow-failures mode and collects/stores flaky-evidence artifacts.
.circleci/main/commands/executions/run-integration-tests.yml	Collects/stores flaky-evidence artifacts for integration tests.
.circleci/main/commands/executions/run-build.yml	Ensures integration tests are skipped for the build-only compile step.

cgorantla

Looks fine to me

Copilot

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 4 comments.

Copilot · 2026-03-30T17:27:08Z

.circleci/scripts/smoke.sh

+          | xargs grep -L 'MinionTests\|SentinelTests\|FlakyTests' \
+          | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
+          > failsafe_classnames
+        ;;
+      minion)
+        circleci tests glob '**/src/test/java/**/*IT*.java' \
+          | xargs grep -l 'MinionTests' \
+          | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
+          > failsafe_classnames
+        ;;
+      sentinel)
+        circleci tests glob '**/src/test/java/**/*IT*.java' \
+          | xargs grep -l 'SentinelTests' \
+          | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
+          > failsafe_classnames
+        ;;
+      flaky)
+        circleci tests glob '**/src/test/java/**/*IT*.java' \
+          | xargs grep -l 'FlakyTests' \
+          | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
+          > failsafe_classnames


`find_tests()` pipes `circleci tests glob` into `xargs grep ...`. When `grep -l`/`-L` finds no matches (or when the glob yields no files), `grep` exits non-zero and, with `set -e -o pipefail`, this will abort the script even though an empty test list should be valid. Consider guarding this stage (e.g., use `xargs -r` and/or wrap the grep stage with `|| true` in a subshell) so `failsafe_classnames` can be empty without failing the job.

Suggested change

-          | xargs grep -L 'MinionTests\|SentinelTests\|FlakyTests' \
+          | xargs -r grep -L 'MinionTests\|SentinelTests\|FlakyTests' \
           | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
-          > failsafe_classnames
+          > failsafe_classnames || true
         ;;
       minion)
         circleci tests glob '**/src/test/java/**/*IT*.java' \
-          | xargs grep -l 'MinionTests' \
+          | xargs -r grep -l 'MinionTests' \
           | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
-          > failsafe_classnames
+          > failsafe_classnames || true
         ;;
       sentinel)
         circleci tests glob '**/src/test/java/**/*IT*.java' \
-          | xargs grep -l 'SentinelTests' \
+          | xargs -r grep -l 'SentinelTests' \
           | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
-          > failsafe_classnames
+          > failsafe_classnames || true
         ;;
       flaky)
         circleci tests glob '**/src/test/java/**/*IT*.java' \
-          | xargs grep -l 'FlakyTests' \
+          | xargs -r grep -l 'FlakyTests' \
           | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
-          > failsafe_classnames
+          > failsafe_classnames || true
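
To illustrate the failure mode this comment describes, here is a minimal, self-contained sketch. It assumes bash and an `xargs` that supports `-r` (GNU findutils does). The scratch directory, the `list_classes` helper, and the `org.example.PlainIT` class are hypothetical and do not appear in the PR, and `find` stands in for `circleci tests glob`. In this sketch the guard is scoped to the grep stage only, so a failure in `sed` or `tr` would still abort the script.

```shell
#!/usr/bin/env bash
set -e -o pipefail

# Hypothetical scratch layout; none of these names come from the PR.
workdir="$(mktemp -d)"
trap 'rm -rf "${workdir}"' EXIT
mkdir -p "${workdir}/src/test/java/org/example"
printf 'class PlainIT {}\n' > "${workdir}/src/test/java/org/example/PlainIT.java"

# Prints the class names whose source mentions the given category.
# An empty result is treated as valid rather than as an error.
list_classes() {
  local category="$1"
  # find stands in for `circleci tests glob` (assumption for this sketch).
  # Without the `|| true`, a no-match grep makes xargs exit non-zero,
  # and pipefail plus errexit would then abort the whole script.
  find "${workdir}" -name '*IT*.java' -print \
    | { xargs -r grep -l "${category}" || true; } \
    | sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "."
}

list_classes 'MinionTests' > "${workdir}/minion_classnames"   # no file matches
list_classes 'PlainIT'     > "${workdir}/plain_classnames"    # one file matches

echo "minion suite entries: $(wc -l < "${workdir}/minion_classnames")"
echo "plain suite entries:  $(wc -l < "${workdir}/plain_classnames")"
```

Removing the `|| true` guard from `list_classes` should make the first call terminate the script, which is the behavior the comment warns about.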

Copilot · 2026-03-30T17:27:10Z

ui/tests/components/EventConfigurationDetail/EventConfigEventTable.test.ts

-      // Before debounce time
+    it('does not call store immediately on input (debounce)', () => {
+      wrapper.vm.onChangeSearchTerm('test')
      expect(store.onChangeEventsSearchTerm).not.toHaveBeenCalled()


This debounce test calls `onChangeSearchTerm('test')` but never flushes/cancels the debounced function. With real timers enabled, the pending debounce may fire after the test completes and bleed into subsequent tests (mock call counts/state). Cancel (or flush) the debounced function at the end of this test or in `afterEach` (e.g., `onChangeSearchTerm.cancel()`).

Suggested change

       expect(store.onChangeEventsSearchTerm).not.toHaveBeenCalled()
+      wrapper.vm.onChangeSearchTerm.cancel()

cgorantla

LGTM!

synqotik

Overall looks good, just one change to make.

synqotik · 2026-04-09T13:58:24Z

smoke-test/src/main/java/org/opennms/smoketest/selenium/AbstractOpenNMSSeleniumHelper.java

+                element);
+
+        return result instanceof Boolean && (Boolean)result;
+    }


Agree with this (Copilot comment): `isCenterPointObscured` should only appear once. I'm thinking of making it a public static function in `Element`, which would actually look more like the implementation here. The JavaScript checks whether the element is null, so you don't have to do that explicitly in the Java code.

Copilot

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 4 comments.

Copilot · 2026-04-09T15:41:06Z

.circleci/scripts/smoke.sh

+      | xargs -I{} cp {} "${FLAKY_EVIDENCE_DIR}/"
+    # Now delete originals so fresh results are written by the retry
    find . \( -path "*/failsafe-reports/TEST-*.xml" -o -path "*/surefire-reports/TEST-*.xml" \) \
      -exec grep -l -E 'failures="[1-9]|errors="[1-9]' {} + 2>/dev/null \
      | xargs rm -f


The `cp`/`rm` pipelines use `xargs` without `-r`/`--no-run-if-empty`, which can still invoke `cp`/`rm` with no arguments when no failing XMLs are found (leading to noisy errors, even though `set +e` prevents failure). Consider adding `xargs -r` (and optionally `-0` with `find -print0`) to make this robust and quiet.

Suggested change

-      | xargs -I{} cp {} "${FLAKY_EVIDENCE_DIR}/"
+      | xargs -r -I{} cp {} "${FLAKY_EVIDENCE_DIR}/"
     # Now delete originals so fresh results are written by the retry
     find . \( -path "*/failsafe-reports/TEST-*.xml" -o -path "*/surefire-reports/TEST-*.xml" \) \
       -exec grep -l -E 'failures="[1-9]|errors="[1-9]' {} + 2>/dev/null \
-      | xargs rm -f
+      | xargs -r rm -f
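
The preserve-then-delete step can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the PR's script: it assumes bash, an `xargs` that accepts `-0` and `-r`, and a `grep` that accepts `--null`. The scratch directory, the `evidence_dir` variable (a stand-in for `FLAKY_EVIDENCE_DIR`), the `failing_reports` and `preserve_and_clear` helpers, and the two report files are hypothetical. File names are passed NUL-delimited, as the optional hardening mentioned in the comment.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical scratch layout; none of these paths come from the PR.
workdir="$(mktemp -d)"
trap 'rm -rf "${workdir}"' EXIT
evidence_dir="${workdir}/flaky-evidence"   # stand-in for FLAKY_EVIDENCE_DIR
reports="${workdir}/module/target/failsafe-reports"
mkdir -p "${evidence_dir}" "${reports}"
printf '<testsuite failures="1" errors="0"/>\n' > "${reports}/TEST-FailingIT.xml"
printf '<testsuite failures="0" errors="0"/>\n' > "${reports}/TEST-PassingIT.xml"

# Prints, NUL-delimited, every JUnit XML that records a failure or an error.
failing_reports() {
  find "${workdir}" \( -path "*/failsafe-reports/TEST-*.xml" -o -path "*/surefire-reports/TEST-*.xml" \) \
    -exec grep -l --null -E 'failures="[1-9]|errors="[1-9]' {} + 2>/dev/null
}

# Copies failing reports aside, then deletes the originals so a retry
# writes fresh results. -r keeps xargs from running cp/rm on empty input.
preserve_and_clear() {
  failing_reports | xargs -0 -r -I{} cp {} "${evidence_dir}/"
  failing_reports | xargs -0 -r rm -f
}

preserve_and_clear   # first pass: one failing report is preserved and removed
preserve_and_clear   # second pass: nothing is failing, so nothing runs

ls "${evidence_dir}" "${reports}"
```

Calling `preserve_and_clear` twice is deliberate: the second call exercises the empty-input path that `-r` is meant to keep quiet.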

Copilot

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.

Copilot · 2026-04-09T16:18:03Z

smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Element.java

+            final WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10), Duration.ofMillis(200));
+            wait.ignoring(StaleElementReferenceException.class).until(webDriver -> {
+                final WebElement element = webDriver.findElement(by);
+                ((JavascriptExecutor)webDriver).executeScript("arguments[0].scrollIntoView({block: 'center', inline: 'center'});", element);
+


`clickWithRetry` uses `webDriver.findElement(by)` inside `WebDriverWait.until(...)` but does not ignore `NoSuchElementException`. Since implicit wait is set to 0, a briefly-missing element will throw immediately and abort the wait instead of retrying, which can reintroduce flakiness. Consider adding `NoSuchElementException` to the ignored exceptions (and/or using an `ExpectedCondition` that returns false/null when the element is absent).

mershad-manesh added 28 commits March 17, 2026 14:16

increase the dbCreateThreads to 8

99b5375

Build using 1 thread per core

7d548d1

use the built in no-transfer-progress

c2b9f54

try to improve the smoke tests and how we build

1941e58

undo change to build

6fa2213

deal with Grafana IT failing due to another element blocking it

3e49d92

add retry functionalities... hopefully this reduces the selenium test…

0f9874d

…case failures

remove vi and fakeTimer

215a5f9

add failsafe

e175883

more changes

464c4f1

more improvements

c747e23

does this work?

9a01eb7

remove fakeTimer

93a8950

enable smoke tests

2d145d9

tweak the integration test script

ac2f8ac

improve logging and calling out errors

fc6c65a

set the id,location,broker_url

35548f2

Revert "set the id,location,broker_url"

b74816f

This reverts commit 35548f2.

Merge remote-tracking branch 'origin/develop' into mem/improvements

36c55a3

Merge remote-tracking branch 'origin/develop' into mem/improvements

f27c1b2

Merge remote-tracking branch 'origin/develop' into mem/improvements

f5d1c06

parallel tweaks

313788c

reduce redundant calls

fb15769

Merge remote-tracking branch 'origin/develop' into mem/improvements

60e4f1d

tweak smoke tests

f1a2068

don't hide flaky test failures, even if they pass on re-run

00577bd

disable flaky test job for now

622c70b

delete build trigger file

ec43484

mershad-manesh requested review from cgorantla and synqotik March 30, 2026 13:15

mershad-manesh requested a review from Copilot March 30, 2026 13:15

Copilot started reviewing on behalf of mershad-manesh March 30, 2026 13:16

Copilot AI reviewed Mar 30, 2026

smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Element.java (outdated)

smoke-test/src/main/java/org/opennms/smoketest/selenium/AbstractOpenNMSSeleniumHelper.java

cgorantla previously approved these changes Mar 30, 2026

.circleci/main/jobs/tests/smoke/smoke-test-sentinel.yml (outdated)

reduce the parallelism to 8 across smoke tests

89c0e52

mershad-manesh dismissed cgorantla’s stale review via 89c0e52 March 30, 2026 14:58

mershad-manesh added 2 commits March 30, 2026 11:23

revert core and minion parallelism back to 10 !smoke

e8d8e90

set implicit wait to 0 and then set it back after we are done.

54298bf

mershad-manesh requested a review from Copilot March 30, 2026 17:21

Copilot started reviewing on behalf of mershad-manesh March 30, 2026 17:22

Copilot AI reviewed Mar 30, 2026

cgorantla previously approved these changes Mar 30, 2026

force a build !smoke

b43e6dd

mershad-manesh dismissed cgorantla’s stale review via b43e6dd March 30, 2026 18:04

synqotik requested changes Apr 9, 2026

mershad-manesh added 2 commits April 9, 2026 11:24

Merge branch 'develop' into mem/improvements

2e2adb1

Implement few improvements

858fbab

mershad-manesh requested a review from Copilot April 9, 2026 15:34

Copilot started reviewing on behalf of mershad-manesh April 9, 2026 15:35

Copilot AI reviewed Apr 9, 2026

fix an issue with itest and unskip few UI tests with some tweaks

971a8f0

mershad-manesh requested a review from Copilot April 9, 2026 16:12

Copilot started reviewing on behalf of mershad-manesh April 9, 2026 16:13

Copilot AI reviewed Apr 9, 2026

mershad-manesh and others added 3 commits April 9, 2026 12:50

Update .circleci/main/commands/executions/run-smoke-tests.yml

7feeccd

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update .circleci/main/commands/executions/run-integration-tests.yml

3f59e4f

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update epoch

6ce04bb

mershad-manesh requested review from cgorantla and synqotik April 9, 2026 18:58

4 participants